Smart Paper Notes

A Tightly Coupled LiDAR-IMU Odometry through Iterated Point-Level Undistortion

Keke Liu , Hao Ma , Zemin Wang

Categories: Robotics | Computer Vision

2022-09-25

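Scan undistortion is a key module for LiDAR odometry in highly dynamic environments with high rotation and translation speeds. Existing work focuses mostly on one-pass undistortion, meaning that each point is undistorted only once in the whole LiDAR-IMU odometry pipeline. In this paper, we propose an optimization-based, tightly coupled LiDAR-IMU odometry that addresses iterated point-level undistortion. By jointly minimizing the cost derived from LiDAR and IMU measurements, our LiDAR-IMU odometry method is more accurate and robust in highly dynamic environments. In addition, the method achieves good computational efficiency by limiting the number of parameters.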
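To make the idea of point-level undistortion concrete, here is a minimal single-pass sketch. It is not the paper's method: it assumes planar motion, a pose that is linearly interpolated between scan start and scan end, and hypothetical names (`interpolate_pose`, `undistort_point`). The abstract describes repeating this kind of correction inside the optimization as the pose estimate improves.

```python
import math

def interpolate_pose(end_pose, s):
    """Linearly interpolate (x, y, yaw) from the scan-start pose (0, 0, 0) to end_pose.

    Assumption: constant linear and angular velocity during the scan.
    """
    x, y, yaw = end_pose
    return (s * x, s * y, s * yaw)

def undistort_point(point, s, end_pose):
    """Re-express a point captured at scan fraction s (0..1) in the scan-end sensor frame."""
    px, py = point
    x_s, y_s, yaw_s = interpolate_pose(end_pose, s)
    # Sensor frame at capture time -> scan-start frame.
    wx = math.cos(yaw_s) * px - math.sin(yaw_s) * py + x_s
    wy = math.sin(yaw_s) * px + math.cos(yaw_s) * py + y_s
    # Scan-start frame -> scan-end sensor frame (inverse of end_pose).
    x_e, y_e, yaw_e = end_pose
    dx, dy = wx - x_e, wy - y_e
    return (math.cos(yaw_e) * dx + math.sin(yaw_e) * dy,
            -math.sin(yaw_e) * dx + math.cos(yaw_e) * dy)

# A static landmark 5 m ahead, observed three times while the sensor moves 1 m forward.
end_pose = (1.0, 0.0, 0.0)
raw = [((5.0, 0.0), 0.0), ((4.5, 0.0), 0.5), ((4.0, 0.0), 1.0)]
corrected = [undistort_point(p, s, end_pose) for p, s in raw]
```

In this toy scan the three raw returns disagree only because the sensor moved; after correction they coincide at the same location in the scan-end frame.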

translated by Google Translate

Research: Modeling Price Elasticity for Occupancy Prediction in Hotel Dynamic Pricing

Fanwei Zhu , Wendong Xiao , Yao Yu , Ziyi Wang , Zulong Chen , Quan Lu , Zemin Liu , Minghui Wu , Shenghua Ni

Categories: Machine Learning

2022-08-04

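Demand estimation plays an important role in dynamic pricing, where the optimal price can be obtained by maximizing revenue based on the demand curve. On online hotel booking platforms, the demand for rooms, or occupancy, varies across room types and changes over time, so obtaining an accurate occupancy estimate is challenging. In this paper, we propose a novel hotel demand function that explicitly models the price elasticity of demand for occupancy prediction, and we design a price elasticity prediction model that learns the dynamic price elasticity coefficient from a variety of influencing factors. Our model is composed of carefully designed elasticity learning modules to alleviate the endogeneity problem, and it is trained in a multi-task framework to address data sparsity. We conduct comprehensive experiments on real-world datasets and show that our method outperforms state-of-the-art baselines for both occupancy prediction and dynamic pricing.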
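As a rough illustration of how a price elasticity coefficient links price to occupancy and revenue, the sketch below uses the textbook constant-elasticity demand curve. This is an assumption for illustration, not the demand function proposed in the paper, and the names `occupancy` and `best_price` are hypothetical.

```python
def occupancy(price, base_price, base_occupancy, elasticity):
    """Textbook constant-elasticity demand, clipped to the valid occupancy range [0, 1].

    elasticity is (percent change in demand) / (percent change in price),
    so it is negative for ordinary goods.
    """
    occ = base_occupancy * (price / base_price) ** elasticity
    return min(1.0, max(0.0, occ))

def best_price(candidates, base_price, base_occupancy, elasticity):
    """Pick the candidate price with the highest expected revenue per room."""
    return max(
        candidates,
        key=lambda p: p * occupancy(p, base_price, base_occupancy, elasticity),
    )

# Hypothetical room type: 60% occupancy at a price of 100, elasticity of -1.5.
candidates = [60, 80, 100, 120, 140]
chosen = best_price(candidates, base_price=100, base_occupancy=0.6, elasticity=-1.5)
```

With elastic demand (elasticity below -1), lowering the price raises revenue until occupancy saturates, which is why the capacity cap matters when searching for the best price.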


Scalar is Not Enough: Vectorization-based Unbiased Learning to Rank

Mouxiang Chen , Chenghao Liu , Zemin Liu , Jianling Sun

Categories: Artificial Intelligence | Machine Learning

2022-06-03

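Unbiased learning to rank (ULTR) aims to train an unbiased ranking model from biased user click logs. Most current ULTR methods are based on the examination hypothesis (EH), which assumes that the click probability can be factorized into two scalar functions, one related to ranking features and the other related to bias factors. Unfortunately, the interactions among features, bias factors, and clicks are complicated in practice and usually cannot be factorized in this independent way. Fitting click data with EH can lead to model misspecification and introduce approximation error. In this paper, we propose a vector-based EH and formulate the click probability as a dot product of two vector functions. This solution is complete because of its universality in fitting arbitrary click functions. Based on it, we propose a novel model named Vectorization that adaptively learns relevance embeddings and sorts documents by projecting the embeddings onto a base vector. Extensive experiments show that our method significantly outperforms state-of-the-art ULTR methods on complex real clicks as well as simple simulated clicks.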
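The difference between the scalar and the vector-based examination hypothesis can be seen on a tiny click table. The sketch below is only an illustration under simplifying assumptions: a 2x2 table of click probabilities indexed by document and position, and hypothetical helper names. It does not reproduce the paper's Vectorization model.

```python
def click_scalar(relevance, observation):
    """Scalar EH: click probability is the product of two scalars."""
    return relevance * observation

def click_vector(relevance_vec, observation_vec):
    """Vector-based EH: click probability is the dot product of two vectors."""
    return sum(r * o for r, o in zip(relevance_vec, observation_vec))

def fits_scalar_eh(table):
    """A 2x2 table factorizes as r_i * o_j only if its determinant is zero."""
    return abs(table[0][0] * table[1][1] - table[0][1] * table[1][0]) < 1e-12

# Hypothetical click probabilities: clicks[doc][position].
clicks = [[0.9, 0.5],
          [0.2, 0.4]]

# With 2-dimensional vectors the table is reproduced exactly: use each row as the
# relevance vector and a one-hot vector per position as the observation vector.
relevance_vecs = clicks
observation_vecs = [[1.0, 0.0], [0.0, 1.0]]
rebuilt = [[click_vector(r, o) for o in observation_vecs] for r in relevance_vecs]
```

Here `fits_scalar_eh(clicks)` is false, so no pair of scalar functions reproduces the table, while the two-dimensional dot product recovers every entry.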


Svadhyaya system for the Second Diagnosing COVID-19 using Acoustics Challenge 2021

Deepak Mittal , Amir H. Poorjam , Debottam Dutta , Debarpan Bhattacharya , Zemin Yu , Sriram Ganapathy , Maneesh Singh

Categories: Machine Learning

2022-06-11

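This report describes the system used to detect COVID-19 positives from three different acoustic modalities, namely speech, breathing, and cough, in the Second DiCOVA Challenge. The proposed system combines four different approaches, each focusing on one aspect of the problem. It reaches blind-test AUCs of 86.41, 77.60, and 84.55 on the breathing, cough, and speech tracks, respectively, and an AUC of 85.37 on the fusion of the three tracks.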


A three-dimensional dual-domain deep network for high-pitch and sparse helical CT reconstruction

Wei Wang , Xiang-Gen Xia , Chuanjiang He , Zemin Ren , Jian Lu

Categories: Computer Vision

2022-01-07

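In this paper, we propose a new GPU implementation of the Katsevich algorithm for helical CT reconstruction. Our implementation divides the sinograms and reconstructs the CT images pitch by pitch. By exploiting the periodic properties of the parameters of the Katsevich algorithm, our method needs to compute these parameters only once for all pitches, so it has a lower GPU memory burden and is well suited to deep learning. By embedding our implementation into a network, we propose an end-to-end deep network for high-pitch helical CT reconstruction with sparse detectors. Because our network uses features extracted from both sinograms and CT images, it can simultaneously reduce the streak artifacts caused by the sparsity of the sinograms and preserve fine details in the CT images. Experiments show that our network outperforms related methods in both subjective and objective evaluations.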


Cross Modal Transformer via Coordinates Encoding for 3D Object Detection

Junjie Yan , Yingfei Liu , Jianjian Sun , Fan Jia , Shuailin Li , Tiancai Wang , Xiangyu Zhang

Categories: Computer Vision

2023-01-03

In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive. CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains robust even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.


Backdoor Attacks Against Dataset Distillation

Yugeng Liu , Zheng Li , Michael Backes , Yun Shen , Yang Zhang

Categories: Machine Learning

2023-01-03

Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.


Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

Yue Han , Jiangning Zhang , Zhucun Xue , Chao Xu , Xintian Shen , Yabiao Wang , Chengjie Wang , Yong Liu , Xiangtai Li

Categories: Computer Vision

2023-01-03

Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes from only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on the novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset for the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.


Rethinking Mobile Block for Efficient Neural Models

Jiangning Zhang , Xiangtai Li , Jian Li , Liang Liu , Zhucun Xue , Boshen Zhang , Zhengkai Jiang , Tianxin Huang , Yabiao Wang , Chengjie Wang

Categories: Computer Vision

2023-01-03

This paper focuses on designing efficient models with few parameters and low FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the same framework is shared. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependency and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods, e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1, surpassing SoTA CNN-/Transformer-based models while balancing model accuracy and efficiency well.


Cluster-guided Contrastive Graph Clustering Network

Xihong Yang , Yue Liu , Sihang Zhou , Siwei Wang , Wenxuan Tu , Qun Zheng , Xinwang Liu , Liming Fang , En Zhu

Categories: Machine Learning

2023-01-03

Benefiting from its capability to exploit intrinsic supervision information, contrastive learning has recently achieved promising performance in the field of deep graph clustering. However, we observe that two drawbacks of the positive and negative sample construction mechanisms prevent existing algorithms from improving further. 1) The quality of positive samples depends heavily on carefully designed data augmentations, while inappropriate data augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable because they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) by mining the intrinsic supervision information in the high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls samples from the same cluster together while pushing away those from other clusters, by maximizing the cross-view cosine similarity between positive samples and minimizing it between negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with the existing state-of-the-art algorithms.
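The pull-together and push-apart objective described above can be sketched with plain cosine similarities. The exact loss used by CCGC is not given in the abstract, so the form below (one minus the mean positive similarity, plus the mean negative similarity) and the names `cosine` and `contrastive_loss` are assumptions for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(positive_pairs, negative_pairs):
    """Assumed toy objective: reward similar positives, penalize similar negatives.

    positive_pairs: cross-view embeddings taken from the same high-confidence cluster.
    negative_pairs: cross-view centers of different high-confidence clusters.
    """
    pos = sum(cosine(u, v) for u, v in positive_pairs) / len(positive_pairs)
    neg = sum(cosine(u, v) for u, v in negative_pairs) / len(negative_pairs)
    return (1.0 - pos) + neg

# Aligned positives and orthogonal cluster centers give the lowest loss.
good = contrastive_loss([([1.0, 0.0], [1.0, 0.0])], [([1.0, 0.0], [0.0, 1.0])])
# Misaligned positives and identical centers give a high loss.
bad = contrastive_loss([([1.0, 0.0], [0.0, 1.0])], [([1.0, 0.0], [1.0, 0.0])])
```

In this toy case the loss is lowest when embeddings of the same cluster agree across the two views and the centers of different clusters point in different directions.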
